GPGPU Performance Estimation with Core and Memory Frequency Scaling
Authors
Abstract
Graphics Processing Units (GPUs) support dynamic voltage and frequency scaling (DVFS) to balance computational performance and energy consumption. However, a simple and accurate way to estimate the performance of a given GPU kernel under different frequency settings on real hardware is still lacking, even though such estimates are important for choosing the best frequency configuration for energy saving. This paper presents a fine-grained model that estimates the execution time of GPU kernels under both core and memory frequency scaling. Across a 2.5x range of both core and memory frequencies and 12 GPU kernels, our model achieves accurate results (within 3.5%) on real hardware. Unlike cycle-level simulators, our model needs only a few simple micro-benchmarks, which extract a set of hardware parameters, together with the kernels' performance counters, to achieve this accuracy.
Similar resources
Performance and Power-Aware Classification for Frequency Scaling of GPGPU Applications
The increased adoption of Graphics Processing Units (GPUs) to accelerate modern computational intensive applications, together with the strict power and energy constraints of many computing systems, has pushed for the development of efficient procedures to exploit dynamic voltage and frequency scaling (DVFS) techniques in GPUs. Although previous works have applied several pattern recognition te...
Cache Power Budgeting for Performance
Power is arguably the critical resource in computer system design today. In this work, we focus on maximizing performance of a chip multiprocessor (CMP) system, for a given power budget, by developing techniques to budget power between processor cores and caches. Dynamic cache configuration can reduce cache capacity and associativity, thereby freeing up chip power, but may increase the miss rat...
Evaluating Scalability of Multi-threaded Applications on a Many-core Platform
Multicore processors have been effective in scaling application performance by dividing computation among multiple threads running in parallel. However, application performance does not necessarily improve as more cores are added. Application performance can be limited due to multiple bottlenecks including contention for shared resources such as caches and memory. In this paper, we perform a sc...
Processor-Memory Power Shifting for Multi-Core Systems
Maximum power consumption is an important consideration in server design, as the total power envelope affects cooling costs and can limit performance. One approach to limiting total power is power shifting, managing power budgets among system sub-components to meet an overall total constraint. In this paper, we investigate processor-memory power shifting on a multi-threaded, 32-core commercial ...
Performance Analysis and Tuning for General Purpose Graphics Processing Units (GPGPU)
General-purpose graphics processing units (GPGPU) have emerged as an important class of shared memory parallel processing architectures, with widespread deployment in every computer class from high-end supercomputers to embedded mobile platforms. Relative to more traditional multicore systems of today, GPGPUs have distinctly higher degrees of hardware multithreading (hundreds of hardware thread...
Journal title:
- CoRR
Volume: abs/1701.05308, Issue: -
Pages: -
Publication date: 2017